Skip to content

Conversation

@RasmusWL
Copy link
Member

@RasmusWL RasmusWL commented Mar 31, 2022

This was a draft PR to ensure we don't merge it before we have agreed with JS/Ruby about the new XmlParsing concept, but they have said good for it, so we're ready to go 👍

With this PR, I have copied the query setup from JS (Ruby only has XXE query). This means that no query will alert on 'DTD retrieval', which I will postpone for future work (this PR is already quite big).

RasmusWL and others added 30 commits March 31, 2022 09:52
After internal discussion, these will replace the `XmlEntityInjection`
query, so we can have separate severities on DoS and the other (more
serious) attacks.

Note: These clearly don't work, since they are verbatim copies of the JS
code, but I split it into multiple commits to clearly highlight what
changes were made.
I changed a few QLdocs so they fit the style we have used in Python...
although I surely do regret having introduced a new style for how these
QLDocs look :D
Note that most of the testing happens in the framework specific tests,
with an inline-expectation test
Which doesn't raise that syntax error (at least not on my laptop)
Kept the test of SimpleXmlRpcServer, and kept the qhelp so it can be
used to write the new qhelp files
and remove the old copy, we don't need it anymore :)
I found this resource quite good myself at least :)
Since there are other XML vulnerabilities that are not about parsing,
this is more correct.
I forgot about the existing ones when I promoted it
I didn't find a good way to actually share the stuff, so we kinda just
have 2 things that look very similar :|
And it's not possible to provide a parser argument either
RasmusWL added 3 commits April 6, 2022 12:56
- `XMLEtree` to `XmlEtree`
- `XMLSax` to `XmlSax`
- `LXML` to `Lxml`
- `XMLParser` to `XmlParser`
This also means that the detection of the values passed to these keyword
arguments will no longer just be from a local scope, but can also be
across function boundaries.
@RasmusWL
Copy link
Member Author

RasmusWL commented Apr 6, 2022

I tried to see whether API::CallNode could be used to merge the xml.etree.ElementTree modeling, but I couldn't find a nice way to do this, so I left them in this slightly ugly state where they sort of share logic, but doesn't actually share the code.

@RasmusWL RasmusWL requested a review from a team April 7, 2022 13:30
@RasmusWL RasmusWL marked this pull request as ready for review April 20, 2022 11:48
@RasmusWL
Copy link
Member Author

Performance evaluation looks good 👍

Copy link
Contributor

@yoff yoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thorough job. Only minor comments. It looks like you deleted a poc (this_directory_not_extracted)?

*/
private class FileAccessFromLxmlParsing extends LxmlParsing, FileSystemAccess::Range {
FileAccessFromLxmlParsing() {
this = API::moduleImport("lxml").getMember("etree").getMember(["parse", "parseid"]).getACall()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered if a utility to get the function name could make this smoother (like MethodCallNode has).

@RasmusWL
Copy link
Member Author

RasmusWL commented May 9, 2022

It looks like you deleted a poc (this_directory_not_extracted)?

Since the old PoC was located with the framework testing code, and the framework testing code was moved to multiple folders, I decided to create a new top-level directory to contain that code: https://github.com/github/codeql/blob/714465bf39d97e31aa6f0a7aa01c57e16f3c3078/python/PoCs/XmlParsing/PoC.py -- I guess I should have mentioned this in a comment 😳 I'm not sure it's the best solution ever, since this means that the framework testing code is far removed from this PoC, but I also couldn't come up with a better solution 🤔

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 5 vulnerabilities.

Comment on lines +3149 to +3154
private DataFlow::TypeTrackingNode instance(DataFlow::TypeTracker t) {
t.start() and
result instanceof InstanceSource
or
exists(DataFlow::TypeTracker t2 | result = instance(t2).track(t2, t))
}

Check warning

Code scanning / CodeQL

Dead code

Code is dead
}

/** Gets a reference to an instance of `io.StringIO`/`io.BytesIO`. */
DataFlow::Node instance() { instance(DataFlow::TypeTracker::end()).flowsTo(result) }

Check warning

Code scanning / CodeQL

Dead code

Code is dead
@RasmusWL
Copy link
Member Author

RasmusWL commented May 9, 2022

Should have fixed the QL4QL alerts now as well 👍

@RasmusWL RasmusWL requested a review from yoff May 9, 2022 09:52
Copy link
Contributor

@yoff yoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not want to drag this out over little refactoring, but since you now did the one for Lxml (that I recall considering), I want to suggest the only other one, I considered. (I think the two branches of elementTreeInstance are fine to keep apart, since they have different semantics).

Co-authored-by: yoff <lerchedahl@gmail.com>
@RasmusWL RasmusWL requested a review from yoff May 9, 2022 14:37
Copy link
Contributor

@yoff yoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yoff yoff merged commit b6605bc into github:main May 9, 2022
@RasmusWL RasmusWL deleted the promote-xxe branch May 10, 2022 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants